Network Cross-Validation for Determining the Number of Communities in Network Data
Abstract
The stochastic block model and its variants have been a popular tool for analyzing large network data with community structure. In this paper we develop an efficient network cross-validation (NCV) approach to determine the number of communities, as well as to choose between the regular stochastic block model and the degree-corrected block model. The proposed NCV method is based on a block-wise node-pair splitting technique, combined with an integrated step of community recovery using sub-blocks of the adjacency matrix. We prove that the probability of under-selection vanishes as the number of nodes increases, under mild conditions satisfied by a wide range of popular community recovery algorithms. The solid performance of our method is also demonstrated in extensive simulations and two data examples.
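The abstract names the ingredients of NCV without spelling out the procedure, so the following is only a minimal sketch under stated assumptions rather than the authors' implementation. It assumes a symmetric 0/1 adjacency matrix, spectral clustering (SVD of the rectangular sub-block followed by plain k-means) as the community recovery step, and squared-error loss on the held-out block. The function names `kmeans` and `ncv_select` and all parameters are hypothetical, and the degree-corrected variant is not covered.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means (assumed recovery step); one label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # keep the old center if a cluster empties
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def ncv_select(A, candidates, n_folds=3, seed=0):
    """Hypothetical helper: return (k_hat, losses) for adjacency matrix A."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    losses = {k: 0.0 for k in candidates}
    for test in folds:
        in_test = np.zeros(n, dtype=bool)
        in_test[test] = True
        # Block-wise node-pair split: a pair is held out only when BOTH
        # endpoints fall in the current fold; all other pairs are training.
        test_pair = np.outer(in_test, in_test)
        np.fill_diagonal(test_pair, False)
        train_pair = ~np.outer(in_test, in_test)
        np.fill_diagonal(train_pair, False)
        mask = train_pair.astype(float)
        # Community recovery sees only the rectangular sub-block
        # (rows outside the fold, all columns), never the held-out block.
        _, _, vt = np.linalg.svd(A[~in_test, :], full_matrices=False)
        for k in candidates:
            labels = kmeans(vt[:k].T, k, rng)  # embeds all n nodes
            onehot = np.eye(k)[labels]
            edges = onehot.T @ (A * mask) @ onehot
            pairs = onehot.T @ mask @ onehot
            B = edges / np.maximum(pairs, 1.0)  # block edge probabilities
            P = B[np.ix_(labels, labels)]
            # Squared-error loss on held-out pairs (each pair counted twice).
            losses[k] += float(((A - P) ** 2)[test_pair].sum())
    k_hat = min(candidates, key=lambda k: losses[k])
    return k_hat, losses
```

Calling `ncv_select(A, [1, 2, 3, 4])` returns the candidate with the smallest accumulated held-out loss together with the per-candidate losses. Because the held-out block never enters the SVD or the block-probability estimates, the loss behaves like an out-of-sample prediction error, which is what lets it penalize too few communities.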
Related resources
Ensemble strategies to build neural network to facilitate decision making
There are three major strategies to form neural network ensembles. The simplest one is the Cross Validation strategy in which all members are trained with the same training data. Bagging and boosting strategies produce perturbed samples from training data. This paper provides an ideal model based on two important factors: activation function and number of neurons in the hidden layer and based u...
Application of Open Data for Official Statistics, Case Study Data of Instagram Social Network
Abstract. The notion of open data is based on the idea of giving users free access to data so that they can reuse it and republish the results without restrictions such as copyright or patents. Due to the ever-increasing growth of Information and Communication Technology (ICT), more data is produced every day, and this brings a great opportunity for National Statistical Offices (NSOs) ...
PREDICTION OF LOAD DEFLECTION BEHAVIOUR OF TWO WAY RC SLAB USING NEURAL NETWORK APPROACH
Reinforced concrete (RC) slabs exhibit complexities in their structural behavior under load due to the composite nature of the material and the multitude and variety of factors that affect such behavior. Current methods for determining the load-deflection behavior of reinforced concrete slabs are limited in scope and are mostly dependent on the results of experimental tests. In this study, an ...
A Study on the Network Governance System of Crisis Management in Tehran, Iran, Based On Participatory Governance: A Social Network Analysis
Background and objective: This study aims to analyze the network governance of safety and crisis management in Tehran by examining the laws of the fourth development plan and emphasizing the participation of key actors, including government institutions, the private sector, non-governmental organizations, and local communities using social network analysis. Method: In this study, 22 laws with 101...
Long-term Streamflow Forecasting by Adaptive Neuro-Fuzzy Inference System Using K-fold Cross-validation: (Case Study: Taleghan Basin, Iran)
Streamflow forecasting has an important role in water resource management (e.g. flood control, drought management, reservoir design, etc.). In this paper, the Adaptive Neuro Fuzzy Inference System (ANFIS) is applied to long-term (monthly, seasonal) streamflow forecasting, and the K-fold cross-validation method is investigated for evaluating the test and training data in the model. Then,...
Publication date: 2015